Weighted Context-Free Grammars
Over Bimonoids 


George RAHONIS, Faidra TORPARI


Abstract 


We introduce and investigate weighted context-free grammars over
an arbitrary bimonoid $K$. Thus, we do not assume that the operations
of $K$ are commutative or idempotent, or that they distribute over each
other. We prove a Chomsky-Schützenberger type theorem for the series
generated by our grammars. Moreover, we show that the class of series
generated by weighted right-linear grammars over a linearly ordered
alphabet $\Sigma$ and $K$ coincides with that of recognizable series over
$\Sigma$ and $K$.
1 Introduction


Weighted models of computation assign quantitative features to computational
processes. For instance, finite automata examine whether an input
word is accepted or not, whereas weighted automata provide information
on the cost of the computation, energy consumption, probability of the
implementation of the computation, etc. On the other hand, context-free
grammars constitute the main generative model, with interesting applications
in compiler development (cf. for instance [17]), model checking [15],
parameterized verification [5], and runtime verification [23]. Both weighted
automata and weighted context-free grammars have been widely studied
over semirings. We refer the reader to [7, 12, 13, 18, 24] for theory and
applications of weighted automata over semirings, and to [2, 3, 14, 25] for
several results on weighted context-free grammars over semirings.

In recent years, it has turned out that the semiring structure is not sufficient
to describe operations needed in modern practical applications, for
instance the average operation. Therefore, several authors built the theory of
computational models over more general structures, namely strong bimonoids
and valuation monoids [4, 10, 11, 19, 27]. More recently, McCarthy-Kleene
logic contributed to the development of an application for runtime verification,
within the projects LogicGuard I and LogicGuard II [20, 21]. The extension
of that tool, for future applications, required a fuzzy type of McCarthy-Kleene
logic as well as weighted computational models. It was proved that
the reasonable weight structure is a particular zero-sum free and zero-divisor
free bimonoid with only left multiplicative zero. Weighted automata over
that bimonoid were investigated in [8, 9]³.

The goal of this paper is to introduce and investigate a weighted
context-free grammar model over an arbitrary bimonoid. Since commutativity
of the operations of the weight structure is not required, we equip
the rule sets of our grammars with a linear order. Our main results are as
follows.


• We prove a Chomsky-Schützenberger theorem for the class of series
generated by weighted context-free grammars. As an intermediate
step, we show a folklore result, namely for every weighted context-free
grammar we can effectively construct an equivalent one in Chomsky
normal form permitting also $\varepsilon$-rules.


• We consider weighted right-linear grammars and show their expressive
equivalence to the weighted automata models of [8, 9], over an arbitrary
bimonoid. For this, we require that the input alphabet is linearly
ordered.


The structure of our paper is as follows. Apart from this Introduction, the
paper contains four sections. In Section 2 we recall notions needed in the
sequel and an example of a zero-sum free and zero-divisor free bimonoid
with left multiplicative zero [8, 9]. In Section 3 we introduce our weighted
context-free grammars and show the closure of the class of their series
under sum. Furthermore, we show that the class of series generated by
unambiguous grammars is closed under multiplication with scalars from the
right, provided the bimonoid has left multiplicative zero. Then, in Section 4
we prove the Chomsky-Schützenberger type theorem, and in Section 5 we
show the expressive equivalence of weighted right-linear grammars and
weighted automata. Finally, in the Conclusion we refer to open problems for
future research.

The results of the paper were presented at the 9th International Workshop
Weighted Automata: Theory and Applications, WATA 2018 [28].

³ In [8, 9] weighted automata were called MK-fuzzy automata due to the specific weight
structure motivated by the fuzzification of McCarthy-Kleene logic.


2 Preliminaries 


Let $\Sigma$ be an alphabet, i.e., a finite nonempty set. We denote by $\Sigma^*$ the
set of all finite words over $\Sigma$, i.e., the free monoid generated by $\Sigma$, and set
$\Sigma^+ = \Sigma^* \setminus \{\varepsilon\}$, where $\varepsilon$ is the empty word. Assume now that $\le$ is a linear
order on $\Sigma$. As usual, we let $a < b$ iff $a \le b$ and $a \ne b$, and we keep this
notation for every order defined in the paper. The lexicographic order $\le_{lex}$
on $\Sigma^*$ is defined as follows. We let $w \le_{lex} u$ iff

\[
(u = wv \text{ with } v \in \Sigma^*) \quad\text{or}\quad
(w = vav',\ u = vbv'' \text{ with } v, v', v'' \in \Sigma^*,\ a, b \in \Sigma \text{ and } a < b)
\]

for every $w, u \in \Sigma^*$.

In the sequel, the lexicographic order is defined, as for $\Sigma^*$, on the free
monoid generated by any finite linearly ordered set.
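
The two clauses of the definition of the lexicographic order can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the function name `lex_leq` and the encoding of the linear order as a `rank` dictionary are hypothetical and do not come from the paper.

```python
from typing import Dict, Hashable, Sequence


def lex_leq(w: Sequence[Hashable], u: Sequence[Hashable],
            rank: Dict[Hashable, int]) -> bool:
    """Return True iff w <=_lex u.

    rank (an assumed encoding) maps every letter to its position
    in the linear order of the alphabet.
    """
    for a, b in zip(w, u):
        if a != b:
            # second clause: w = v a v', u = v b v'' with a < b
            return rank[a] < rank[b]
    # no difference on the common prefix:
    # first clause, w <=_lex u iff u = w v for some word v
    return len(w) <= len(u)


rank = {"a": 0, "b": 1}          # the order a < b
print(lex_leq("ab", "abb", rank))  # prefix case
print(lex_leq("ab", "b", rank))    # first differing letters: a < b
print(lex_leq("b", "ab", rank))    # first differing letters: b > a
```

The same function applies to words over any finite linearly ordered set, for example sequences of rules, once `rank` encodes the corresponding order.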

A bimonoid $(K,+,\cdot,0,1)$ (cf. [10]) consists of a set $K$, two binary
operations $+$ and $\cdot$ and two constant elements 0 and 1 such that $(K,+,0)$
and $(K,\cdot,1)$ are monoids. The bimonoid is denoted simply by $K$ if the
operations and the constant elements are understood. If no confusion arises,
we shall denote the $\cdot$ operation also by juxtaposition. The bimonoid $K$
has left (resp. right) multiplicative zero if 0 acts as a left (resp. right)
multiplicative zero, i.e., $0 \cdot k = 0$ (resp. $k \cdot 0 = 0$) for every $k \in K$. If the
monoid $(K,+,0)$ is commutative and 0 acts as a left and right multiplicative
zero, then the bimonoid is called strong. A semiring is a strong bimonoid
where multiplication distributes over addition.

Next we recall from [8, 9] an example of a bimonoid with left multiplicative
zero. The underlying set of the bimonoid consists of quadruples of the form $(t, f, u, e)$
where $t, f, u, e \in [0,1]$ and $t + f + u + e = 1$. The operations $\sqcup$ and $\sqcap$ of the
bimonoid are motivated by the fuzzification of disjunction and conjunction of
McCarthy-Kleene logic [16, 22], respectively. Weighted automata over that
bimonoid were introduced and studied in [8, 9] as computational models for
the quantitative runtime verification within the LogicGuard projects [20, 21].


Example 1 [8, 9] We consider the set

\[
K = \{(t, f, u, e) \in [0,1]^4 \mid t + f + u + e = 1\}
\]

and define on $K$ two binary operations $\sqcup$ and $\sqcap$ as follows. For every
$k_1 = (t_1, f_1, u_1, e_1) \in K$, $k_2 = (t_2, f_2, u_2, e_2) \in K$ we let $k_3 = k_1 \sqcup k_2$ and
$k_4 = k_1 \sqcap k_2$ where $k_3 = (t_3, f_3, u_3, e_3)$ and $k_4 = (t_4, f_4, u_4, e_4)$ are defined
by the relations

\[
\begin{array}{ll}
t_3 = t_1 + (f_1 + u_1)t_2 & t_4 = t_1 t_2 \\
f_3 = f_1 f_2 & f_4 = f_1 + (t_1 + u_1)f_2 \\
u_3 = f_1 u_2 + u_1(f_2 + u_2) & u_4 = t_1 u_2 + u_1(t_2 + u_2) \\
e_3 = e_1 + (f_1 + u_1)e_2 & e_4 = e_1 + (t_1 + u_1)e_2.
\end{array}
\]

Then the structure $(K, \sqcup, \sqcap, \mathbf{0}, \mathbf{1})$, where $\mathbf{0} = (0,1,0,0)$ and $\mathbf{1} = (1,0,0,0)$,
is a bimonoid with left multiplicative zero. Furthermore, it is zero-sum free
and zero-divisor free, i.e., for every $k, k' \in K$, $k \sqcup k' = \mathbf{0}$ implies $k = k' = \mathbf{0}$,
and $k \sqcap k' = \mathbf{0}$ implies $k = \mathbf{0}$ or $k' = \mathbf{0}$, respectively.
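
The bimonoid laws claimed in Example 1 can be spot-checked on a few sample elements. The sketch below is an illustration only: the names `join`, `meet`, `ZERO`, `ONE` and the sample quadruples are assumptions made here, and exact rational arithmetic is used so that equalities are not blurred by rounding. Checking finitely many samples is of course no proof.

```python
from fractions import Fraction as Fr
from itertools import product


def join(k1, k2):
    """The operation ⊔ of Example 1 (fuzzified disjunction)."""
    t1, f1, u1, e1 = k1
    t2, f2, u2, e2 = k2
    return (t1 + (f1 + u1) * t2,
            f1 * f2,
            f1 * u2 + u1 * (f2 + u2),
            e1 + (f1 + u1) * e2)


def meet(k1, k2):
    """The operation ⊓ of Example 1 (fuzzified conjunction)."""
    t1, f1, u1, e1 = k1
    t2, f2, u2, e2 = k2
    return (t1 * t2,
            f1 + (t1 + u1) * f2,
            t1 * u2 + u1 * (t2 + u2),
            e1 + (t1 + u1) * e2)


ZERO = (Fr(0), Fr(1), Fr(0), Fr(0))   # unit of ⊔
ONE = (Fr(1), Fr(0), Fr(0), Fr(0))    # unit of ⊓
ERR = (Fr(0), Fr(0), Fr(0), Fr(1))    # the pure "error" quadruple

# Arbitrarily chosen sample elements of K (components sum to 1).
samples = [ZERO, ONE, ERR,
           (Fr(1, 4), Fr(1, 4), Fr(1, 4), Fr(1, 4)),
           (Fr(1, 2), Fr(0), Fr(1, 3), Fr(1, 6))]

for k in samples:
    assert join(ZERO, k) == k == join(k, ZERO)   # 0 is the unit of ⊔
    assert meet(ONE, k) == k == meet(k, ONE)     # 1 is the unit of ⊓
    assert meet(ZERO, k) == ZERO                 # left multiplicative zero

for a, b, c in product(samples, repeat=3):
    assert join(join(a, b), c) == join(a, join(b, c))
    assert meet(meet(a, b), c) == meet(a, meet(b, c))
    assert sum(join(a, b)) == 1 and sum(meet(a, b)) == 1  # closure in K

# 0 is not a right multiplicative zero, and ⊔ is not commutative.
assert meet(ERR, ZERO) != ZERO
assert join(ONE, ERR) != join(ERR, ONE)
```

The last two assertions show on concrete elements why neither a right multiplicative zero nor commutativity of the first operation can be assumed for this structure.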


We refer the reader to [26] for further examples of bimonoids.
Throughout the paper $(K,+,\cdot,0,1)$ denotes an arbitrary bimonoid.


Let $Q$ be a set. A formal series (or simply series) over $Q$ and $K$ is
a mapping $s : Q \to K$. The support of $s$ is the set $supp(s) = \{q \in Q \mid
s(q) \ne 0\}$. A series with finite support is also called a polynomial. The
constant series $\tilde{k}$ ($k \in K$) is defined, for every $q \in Q$, by $\tilde{k}(q) = k$. We
denote by $K\langle\langle Q\rangle\rangle$ the class of all series over $Q$ and $K$. Let $s, r \in K\langle\langle Q\rangle\rangle$
and $k \in K$. The sum $s + r$, the products with scalars $ks$ and $sk$, and the
Hadamard product $s \odot r$ are defined elementwise, respectively by $(s+r)(q) =
s(q) + r(q)$, $(ks)(q) = k s(q)$, $(sk)(q) = s(q) k$, $(s \odot r)(q) = s(q) r(q)$ for every
$q \in Q$. Trivially, the structure $(K\langle\langle Q\rangle\rangle, +, \odot, \tilde{0}, \tilde{1})$ is a bimonoid. As usual,
we write a series $s \in K\langle\langle Q\rangle\rangle$ in the form $s = \sum_{q \in Q} s(q).q$.
3 Weighted Context-Free Grammars


In this section we introduce the concept of weighted context-free grammars
over the bimonoid $K$ and state two closure properties of the class of their
series. For this, we first recall the notion of a context-free grammar. More
precisely, a context-free grammar is a quadruple $G = (\Sigma, N, S, R)$ where $\Sigma$ is
the alphabet of terminals, $N$ is the alphabet of variables (or nonterminals)
with $\Sigma \cap N = \emptyset$, $S \in N$ is the initial variable, and $R \subseteq N \times (\Sigma \cup N)^*$ is
the finite set of production rules (or simply rules). We use capital letters
$A, B, C$, etc. to denote the elements of $N$. As usual, a rule $(A, v) \in R$ is
also written as $A \to v$. A rule $r \in R$ is called a chain rule if it is of the
form $r = (A \to B)$ with $B \in N$, and it is called an $\varepsilon$-rule if it is of the
form $r = (A \to \varepsilon)$. The elements of $(\Sigma \cup N)^*$ are called sentential forms
of $G$. We define the direct derivation relation $\Rightarrow_G \subseteq (\Sigma \cup N)^* \times (\Sigma \cup N)^*$
as follows. Let $w, u \in (\Sigma \cup N)^*$. Then, we set $w \Rightarrow_G u$ iff $w = w_1 A w_2$,
$u = w_1 v w_2$ with $w_1, w_2 \in (\Sigma \cup N)^*$ and there is a rule $r = (A \to v) \in R$.
Sometimes we also write $w \Rightarrow^{r}_{G} u$ if we want to denote that $w$ directly
derives $u$ with the application of rule $r$. The direct derivation $w \Rightarrow_G u$ is
called leftmost if $w_1 \in \Sigma^*$. In the sequel, the term derivation refers only to
leftmost direct derivations and if no confusion arises we simply write $\Rightarrow$
instead of $\Rightarrow_G$. As usual, we denote by $\Rightarrow^*$ the reflexive and transitive
closure of $\Rightarrow$. Sometimes by a derivation of $G$ we refer to a finite sequence
$d = r_0 \ldots r_{n-1}$, $n \ge 1$, of rules $r_i \in R$, $0 \le i \le n-1$, such that there are
sentential forms $w_i \in (\Sigma \cup N)^*$ with $w_i \Rightarrow^{r_i} w_{i+1}$ for every $0 \le i \le n-1$. In
this case we write $w_0 \Rightarrow^{d} w_n$. For every $A \in N$, $w \in \Sigma^*$ an $A$-derivation
of $w$ is a derivation $d$ of $G$ such that $A \Rightarrow^{d} w$. We denote by $D(A, w)$ the set
of all $A$-derivations of $w$ and we simply write $D(w)$ for $A = S$. The language
generated by $G$ is $L(G) = \{w \in \Sigma^* \mid D(w) \ne \emptyset\}$. A language $L \subseteq \Sigma^*$ is called
context-free if there is a context-free grammar $G = (\Sigma, N, S, R)$ such that
$L = L(G)$. A context-free grammar $G = (\Sigma, N, S, R)$ is called unambiguous
if $|D(w)| \le 1$ for every $w \in \Sigma^*$.

For our weighted context-free grammars we shall need the notion of
a loop-free derivation. More precisely, a derivation $w_0 \Rightarrow^{d} w_n$ with $d =
r_0 \ldots r_{n-1}$ is called loop-free if there are no indices $0 \le j \le k \le n-1$ and
$A \in N$ such that $A \Rightarrow^{r_j \ldots r_k} A$. We shall denote by $\mathit{lf}D(A, w)$ (resp. $\mathit{lf}D(w)$)
the set of all loop-free derivations in $D(A, w)$ (resp. in $D(w)$). It should
be clear that for every $A \in N$ and $w \in \Sigma^*$ the set $\mathit{lf}D(A, w)$ is finite.
Furthermore, if $G$ is unambiguous, then every derivation of $G$ is loop-free.
Definition 2 A weighted context-free grammar (wcfg for short) over $\Sigma$ and
$K$ is a five-tuple $G = (\Sigma, N, S, R, wt)$ where $(\Sigma, N, S, R)$ is a context-free
grammar such that the set $R$ of rules is assumed to be linearly ordered and
$wt : R \to K$ is a mapping assigning weights to the rules.


Let $w \Rightarrow^{d}_{G} u$ with $w, u \in (\Sigma \cup N)^*$ and $d = r_0 \ldots r_{n-1}$, $r_i \in R$, $0 \le
i \le n-1$. The weight $weight(d)$ of $d$ is determined by $weight(d) =
wt(r_0) \cdot \ldots \cdot wt(r_{n-1})$. The series $\|G\| \in K\langle\langle\Sigma^*\rangle\rangle$ of $G$ is defined in the
following way. Let $w \in \Sigma^*$ and assume that $\mathit{lf}D(w) = \{d_1, \ldots, d_m\} \ne \emptyset$
with $d_1 \le_{lex} \ldots \le_{lex} d_m$. Then, we set

\[
\|G\|(w) = \sum_{1 \le i \le m} weight(d_i)
\]

where we sum up in an ascending order according to the usual ordering of
natural numbers. If $\mathit{lf}D(w) = \emptyset$, then we let $\|G\|(w) = 0$. We also say
that $\|G\|$ is generated by $G$. Since $\mathit{lf}D(w)$ is finite for every $w \in \Sigma^*$ the
value $\|G\|(w)$ is well-defined. A series $s$ over $\Sigma$ and $K$ is called context-free
if there is a wcfg $G$ over $\Sigma$ and $K$ such that $s = \|G\|$. Two wcfg $G_1, G_2$ over $\Sigma$
and $K$ are called equivalent if $\|G_1\| = \|G_2\|$.
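
As a small worked illustration of the definition of $\|G\|$, consider a hypothetical wcfg (constructed here for illustration, not taken from the paper) with $\Sigma = \{a\}$, $N = \{S, A\}$ and three rules ordered $r_1 < r_2 < r_3$. The computation sketches how the linear order on $R$ fixes the order of summation.

```latex
\[
r_1 = (S \to a), \quad r_2 = (S \to A), \quad r_3 = (A \to a), \qquad
wt(r_i) = k_i \in K \ (i = 1, 2, 3).
\]
% Both S-derivations of a are loop-free, and r_1 <_lex r_2 r_3 because r_1 < r_2.
\[
\mathit{lf}D(a) = \{\, r_1,\ r_2 r_3 \,\}, \qquad
\|G\|(a) = k_1 + (k_2 \cdot k_3).
\]
% With the opposite choice r_2 < r_1 we have r_2 r_3 <_lex r_1, so the summands swap:
\[
\|G\|(a) = (k_2 \cdot k_3) + k_1,
\]
% which may differ from k_1 + (k_2 \cdot k_3) when + is not commutative.
% For every other word w we have lfD(w) = \emptyset, hence \|G\|(w) = 0.
```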

A wcfg $G = (\Sigma, N, S, R, wt)$ over $\Sigma$ and $K$ is called unambiguous if the
underlying context-free grammar $(\Sigma, N, S, R)$ is unambiguous.


Proposition 3 Let $s_1, s_2 \in K\langle\langle\Sigma^*\rangle\rangle$ be context-free series. Then, the sum
$s_1 + s_2$ is also a context-free series.


Proof: Let $G_i = (\Sigma, N_i, S_i, R_i, wt_i)$ be wcfg over $\Sigma$ and $K$ such that
$s_i = \|G_i\|$ for $i = 1, 2$. Without loss of generality, we assume that $N_1 \cap N_2 = \emptyset$
and consider the wcfg $G = (\Sigma, N, S, R, wt)$ over $\Sigma$ and $K$ with $N = N_1 \cup
N_2 \cup \{S\}$ where $S$ is a new variable, $R = R_1 \cup R_2 \cup \{S \to S_1, S \to S_2\}$, and

\[
wt(r) = \begin{cases}
wt_1(r) & \text{if } r \in R_1 \\
wt_2(r) & \text{if } r \in R_2 \\
1 & \text{if } r = (S \to S_1) \text{ or } r = (S \to S_2)
\end{cases}
\]

for every $r \in R$.

We define a linear order on the set $R$ of rules in the following way.
We preserve the orders of $R_1$ and $R_2$ and we set $S \to S_1 < \min R_1$ and
$\max R_1 < S \to S_2 < \min R_2$.
Let $w \in \Sigma^*$ such that $\mathit{lf}D(w) \ne \emptyset$. By definition of the set $R$, we trivially
get that $\mathit{lf}D(w) = \mathit{lf}D_1(w) \cup \mathit{lf}D_2(w)$ where $\mathit{lf}D_i(w)$ denotes the set of loop-free
derivations of the form $S \Rightarrow_G S_i \Rightarrow^*_G w$, for $i = 1, 2$. Furthermore,
the order of $R$ implies $\|G\|(w) = \|G_1\|(w) + \|G_2\|(w)$. If $\mathit{lf}D(w) = \emptyset$, then
$\mathit{lf}D_1(w) = \mathit{lf}D_2(w) = \emptyset$, hence again we get $\|G\|(w) = \|G_1\|(w) + \|G_2\|(w)$,
and our proof is completed.


Proposition 4 Let us assume that the bimonoid $(K,+,\cdot,0,1)$ has left multiplicative
zero. Let also $s \in K\langle\langle\Sigma^*\rangle\rangle$ be the series of an unambiguous wcfg
over $\Sigma$ and $K$, and $k \in K$. Then $sk$ is the series of an unambiguous wcfg
over $\Sigma$ and $K$.


Proof: Let $G = (\Sigma, N, S, R, wt)$ be an unambiguous wcfg over $\Sigma$ and $K$
such that $s = \|G\|$. We consider the new variables $S', A_k$ and the wcfg
$G' = (\Sigma, N', S', R', wt')$ over $\Sigma$ and $K$ with $N' = N \cup \{S', A_k\}$, $R' = R \cup
\{S' \to S A_k, A_k \to \varepsilon\}$, $wt'(r) = wt(r)$ for every $r \in R$, $wt'(S' \to S A_k) = 1$,
and $wt'(A_k \to \varepsilon) = k$. We extend the order on $R$ to an order on $R'$ by
letting $S' \to S A_k < \min R$ and $\max R < A_k \to \varepsilon$. Let $w \in \Sigma^*$. Since $G$ is
unambiguous we get $|\mathit{lf}D(w)| \le 1$. Let us assume that $\mathit{lf}D(w) = \{d\}$ and
$S \Rightarrow^{d}_{G} w$. By construction of $R'$, we get that the derivation

\[
d' : S' \Rightarrow_{G'} S A_k \Rightarrow^{d}_{G'} w A_k \Rightarrow_{G'} w
\]

of $G'$ for $w$ is unique. By definition of $wt'$, we get $weight(d') = weight(d) k$,
hence $\|G'\|(w) = (sk)(w)$. If $\mathit{lf}D(w) = \emptyset$, then trivially there is no $S'$-derivation
of $G'$ for $w$, and since 0 acts as left multiplicative zero, again we
get $(sk)(w) = s(w) \cdot k = 0 \cdot k = 0 = \|G'\|(w)$, and we are done.


4 Chomsky-Schützenberger Theorem


In this section we show that a Chomsky-Schützenberger type result holds
for the class of series generated by wcfg over $\Sigma$ and $K$. For this, we first
prove a folklore result, namely for every wcfg we can effectively construct
an equivalent one in Chomsky normal form. Our definition for Chomsky
normal form follows the one in [1], hence we permit $\varepsilon$-rules.


Definition 5 A wcfg $G = (\Sigma, N, S, R, wt)$ over $\Sigma$ and $K$ is said to be in
Chomsky normal form if every rule $r \in R$ is of the form $r = (A \to BC)$ or
$r = (A \to a)$ with $B, C \in N$ and $a \in \Sigma \cup \{\varepsilon\}$.
By Definition 5, we get that if $G$ is in Chomsky normal form, then it
has no chain rules. We shall need the following lemma.


Lemma 6 Let $G = (\Sigma, N, S, R, wt)$ be a wcfg over $\Sigma$ and $K$. Then we can
effectively construct an equivalent wcfg $G'$ without chain rules.


Proof: If $G$ contains no chain rules, then we set $G' = G$. Otherwise,
we construct a wcfg $G' = (\Sigma, N', S, R', wt')$ over $\Sigma$ and $K$ as follows. We
consider a new symbol $Y$ and let $N' = N \cup \{Y\}$. Then, we define $R'$ by
adding a new rule $Y \to \varepsilon$ to $R$, and replacing every chain rule $A \to B \in R$
by a new rule $A \to BY$. The weight mapping $wt'$ coincides with $wt$ on
the non-chain rules of $R$ and we set $wt'(A \to BY) = wt(A \to B)$ for every
$A \to B \in R$ and $wt'(Y \to \varepsilon) = 1$. Finally, we extend the linear order
on $R$ to a linear order on $R'$ by taking the order of $R$ and replacing the rule
$A \to B$ by the rule $A \to BY$, and letting $Y \to \varepsilon$ be the maximum element
of $R'$. Then, it is straightforward to show the equivalence of $G$ and $G'$.
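
The construction in the proof of Lemma 6 is a purely syntactic rewriting of the ordered rule list. The Python sketch below illustrates it under assumptions made here: rules are encoded as triples `(lhs, rhs, weight)` listed in ascending rule order, the function name `remove_chain_rules` is hypothetical, and the fresh variable is called `"Y"`.

```python
def remove_chain_rules(rules, nonterminals, one, fresh="Y"):
    """Replace every chain rule A -> B by A -> B Y and append Y -> ε.

    rules: list of (lhs, rhs, weight), in ascending rule order;
           rhs is a tuple of symbols, () stands for ε.
    one:   the unit 1 of the multiplication of the bimonoid.
    Returns the new ordered rule list and the new set of variables.
    """
    def is_chain(rhs):
        return len(rhs) == 1 and rhs[0] in nonterminals

    if not any(is_chain(rhs) for _, rhs, _ in rules):
        return list(rules), set(nonterminals)   # G' = G

    assert fresh not in nonterminals, "the new symbol must be fresh"
    new_rules = []
    for lhs, rhs, weight in rules:
        if is_chain(rhs):
            # same weight, same position in the linear order
            new_rules.append((lhs, (rhs[0], fresh), weight))
        else:
            new_rules.append((lhs, rhs, weight))
    # Y -> ε gets weight 1 and becomes the maximum of the order
    new_rules.append((fresh, (), one))
    return new_rules, set(nonterminals) | {fresh}


rules = [("S", ("a",), "k1"), ("S", ("A",), "k2"), ("A", ("a",), "k3")]
new_rules, new_vars = remove_chain_rules(rules, {"S", "A"}, one="1")
print(new_rules)
```

Weights are plain strings in the usage example only to keep the output readable; any representation of elements of $K$ works, since the function never combines weights.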


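Proposition 7 Let $G = (\Sigma, N, S, R, wt)$ be a wcfg over $\Sigma$ and $K$. Then, we
can effectively construct an equivalent one in Chomsky normal form.


Proof: By Lemma 6 we assume that $G$ contains no chain rules. For every
$a \in \Sigma$, we consider a new variable $X_a$ and a new rule $X_a \to a$. We set
$R_1 = \{X_a \to a \mid a \in \Sigma\}$ and define an arbitrary linear order on it. Next, let $\overline{R}$
comprise all rules of $R$ of the form $A \to u_1 a_1 u_2 a_2 \ldots u_k a_k u_{k+1}$ with $k \ge 1$,
$a_1, \ldots, a_k \in \Sigma$, and $u_1, \ldots, u_{k+1} \in N^*$ such that $u_1 u_2 a_2 \ldots a_k u_{k+1} \ne \varepsilon$.
We replace in $R$ every rule $A \to u_1 a_1 u_2 a_2 \ldots u_k a_k u_{k+1} \in \overline{R}$ by a new rule
$A \to u_1 X_{a_1} u_2 X_{a_2} \ldots u_k X_{a_k} u_{k+1}$, and obtain the set of rules $\widetilde{R}$. We define
a linear order on $\widetilde{R}$ by taking the order on $R$ and replacing every rule
of the form $A \to u_1 a_1 \ldots u_k a_k u_{k+1} \in \overline{R}$ by its corresponding one $A \to
u_1 X_{a_1} \ldots u_k X_{a_k} u_{k+1}$. Moreover, we let $\max \widetilde{R} < \min R_1$. Now, we consider
the wcfg $G' = (\Sigma, N', S, R', wt')$ over $\Sigma$ and $K$ with $N' = N \cup \{X_a \mid a \in \Sigma\}$
and $R' = \widetilde{R} \cup R_1$. The weight mapping $wt'$ is determined by

\[
wt'(r) = \begin{cases}
wt(r) & \text{if } r \in R \setminus \overline{R} \\
wt(A \to u_1 a_1 \ldots u_k a_k u_{k+1}) & \text{if } r = (A \to u_1 X_{a_1} \ldots u_k X_{a_k} u_{k+1}) \\
1 & \text{if } r \in R_1
\end{cases}
\]

for every $r \in R'$.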
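We aim to show that $\|G'\| = \|G\|$. Indeed, let $w \in \Sigma^*$ and assume that
$\mathit{lf}D(w) = \{d_1, \ldots, d_n\}$ is the set of all loop-free derivations of $G$ for $w$ with
$d_1 \le_{lex} \ldots \le_{lex} d_n$. By definition of $G'$, there is a derivation $d'_i$ of $G'$ for $w$,
which corresponds to $d_i$ for every $1 \le i \le n$, and vice versa. Moreover, we
trivially get

\[
weight(d'_i) = weight(d_i).
\]

It remains to prove that $d'_1 \le_{lex} \ldots \le_{lex} d'_n$. For this, we fix an $1 \le i < n$
and let $d_i = r_1 \ldots r_m r u$ and $d_{i+1} = r_1 \ldots r_m p t$ with $r_1, \ldots, r_m \in R$, $r = (A \to
x) \in R$, $p = (A \to y) \in R$, $u, t \in R^*$ and $r < p$⁴. This implies that $x \ne y$,
and

\[
S \Rightarrow^{r_1 \ldots r_m}_{G} w_1 A v \Rightarrow^{r}_{G} w_1 x v \Rightarrow^{u}_{G} w
\]

for $d_i$, and

\[
S \Rightarrow^{r_1 \ldots r_m}_{G} w_1 A v \Rightarrow^{p}_{G} w_1 y v \Rightarrow^{t}_{G} w
\]

for $d_{i+1}$, where $w_1 \in \Sigma^*$, and $x, y, v \in (\Sigma \cup N)^*$.
By construction of $G'$ we get that

\[
S \Rightarrow^{r'_1 \ldots r'_k}_{G'} w_1 A v' \Rightarrow^{r'}_{G'} w_1 x' v' \Rightarrow^{*}_{G'} w
\]

for $d'_i$, with $r'_1, \ldots, r'_k \in R'$, $r' = (A \to x') \in R'$, $w_1 \in \Sigma^*$, $x', v' \in (\Sigma \cup N')^*$
and $k \ge m$, where $x' = x$ if $x \in \Sigma \cup \{\varepsilon\}$, otherwise $x'$ is obtained from $x$ by
replacing every letter $a \in \Sigma$ by the variable $X_a$, and

\[
S \Rightarrow^{r'_1 \ldots r'_k}_{G'} w_1 A v' \Rightarrow^{p'}_{G'} w_1 y' v' \Rightarrow^{*}_{G'} w
\]

for $d'_{i+1}$, with $r'_1, \ldots, r'_k \in R'$, $p' = (A \to y') \in R'$, $w_1 \in \Sigma^*$, $y', v' \in
(\Sigma \cup N')^*$, where $y' = y$ if $y \in \Sigma \cup \{\varepsilon\}$, otherwise $y'$ is obtained from $y$ by
replacing every letter $a \in \Sigma$ by the variable $X_a$.

Since $r < p$, taking into account the order of $R'$, we get $r' < p'$ which in
turn implies that $d'_i \le_{lex} d'_{i+1}$.

If $\mathit{lf}D(w) = \emptyset$, then obviously there is no derivation of $w$ in $G'$. We
conclude $\|G'\| = \|G\|$, as required.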

Next for every rule $r' \in R'$ of the form $r' = (A \to B_1 B_2 \ldots B_k)$ with
$k \ge 3$, we consider the new variables $Y_1, Y_2, \ldots, Y_{k-2}$, and the new rules

\[
\begin{array}{l}
\rho^{(1)} = (A \to B_1 Y_1) \\
\rho^{(2)} = (Y_1 \to B_2 Y_2) \\
\quad\vdots \\
\rho^{(k-1)} = (Y_{k-2} \to B_{k-1} B_k).
\end{array}
\]

Then, we replace in $R'$ the rule $r'$ by its corresponding above list of rules
and obtain a new set of rules $R''$. We define a linear order on $R''$ by taking
the order of $R'$ and replacing the rule $r'$ by the linear order

\[
\rho^{(1)} < \rho^{(2)} < \ldots < \rho^{(k-1)}.
\]

⁴ The case $d_i = r_1 \ldots r_m$ and $d_{i+1} = r_1 \ldots r_m v$ with $v \in R^*$ does not occur since
$S \Rightarrow^{d_i}_{G} w$ means that the derivation terminates.


We let also $N''$ be the set $N'$ with the new variables obtained in the above
procedure. Now, we consider the wcfg $G'' = (\Sigma, N'', S, R'', wt'')$ over $\Sigma$ and $K$
where the weight mapping $wt''$ is defined as follows. It coincides with $wt'$ on
the rules of $R''$ which belong to $R'$ and, keeping the above notations for $r'$,
we let

\[
wt''(\rho^{(1)}) = wt'(r'), \quad\text{and}\quad
wt''(\rho^{(2)}) = \ldots = wt''(\rho^{(k-1)}) = 1.
\]

It should be clear that $G''$ is in Chomsky normal form. Furthermore, let
$w \in \Sigma^*$ such that the set of derivations $\mathit{lf}D'(w) = \{d'_1, \ldots, d'_n\}$ of $w$ in $G'$
is nonempty. Trivially, for every derivation $d'_i$, $1 \le i \le n$, there is a
unique derivation $d''_i$ of $w$ in $G''$ and vice versa. Furthermore, $weight(d''_i) =
weight(d'_i)$ and by construction of $R''$, by standard arguments, we get $d''_1 \le_{lex}
\ldots \le_{lex} d''_n$ whenever $d'_1 \le_{lex} \ldots \le_{lex} d'_n$. If $\mathit{lf}D'(w) = \emptyset$, then obviously
there is no derivation of $w$ in $G''$. We conclude that $\|G''\| = \|G'\|$, and our
proof is completed.
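
The last step of the proof of Proposition 7, which splits a long rule into binary ones, can be sketched in Python as follows. The encoding of rules as `(lhs, rhs, weight)` triples in ascending order, the function name `binarize` and the naming scheme of the fresh variables are assumptions of this sketch; in particular it assumes that the generated names do not clash with existing symbols.

```python
def binarize(rules, one):
    """Split every rule A -> B1 ... Bk with k >= 3 into k-1 binary rules.

    rules: list of (lhs, rhs, weight) in ascending rule order.
    one:   the unit 1 of the multiplication of the bimonoid.
    The first new rule inherits the weight of the original rule,
    the remaining ones get weight 1; the new rules take the place
    of the original rule in the order.
    """
    out = []
    counter = 0
    for lhs, rhs, weight in rules:
        k = len(rhs)
        if k < 3:
            out.append((lhs, rhs, weight))
            continue
        counter += 1
        # fresh variables Y_1, ..., Y_{k-2} (hypothetical naming scheme)
        fresh = [f"Y{counter}_{j}" for j in range(1, k - 1)]
        heads = [lhs] + fresh
        for j in range(k - 2):
            w = weight if j == 0 else one
            out.append((heads[j], (rhs[j], fresh[j]), w))
        # last rule: Y_{k-2} -> B_{k-1} B_k
        out.append((heads[k - 2], (rhs[k - 2], rhs[k - 1]), one))
    return out


rules = [("A", ("B", "C", "D", "E"), "k"), ("B", ("b",), "m")]
for rule in binarize(rules, one="1"):
    print(rule)
```

Since all but the first new rule carry the weight 1, the product of the weights along the new rules equals the weight of the rule they replace, which is what the weight-preservation argument in the proof relies on.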


For our Chomsky-Schützenberger theorem, we still need some preliminary
matter. First, we recall the notion of Dyck languages. More
precisely, let $Y$ be an alphabet and $\overline{Y} = \{\overline{y} \mid y \in Y\}$ a copy of it. Then,
the Dyck language over $Y$, denoted by $D_Y$, is the context-free language
generated by the grammar $G_Y = (Y \cup \overline{Y}, N, S, R)$ with $N = \{S\}$ and
$R = \{S \to y S \overline{y} \mid y \in Y\} \cup \{S \to SS, S \to \varepsilon\}$.
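
Membership in a Dyck language can be decided with a stack, which may help when experimenting with the constructions of this section. The sketch below uses a well-known stack-based check rather than the grammar $G_Y$ itself; the function name `is_dyck` and the dictionary `closing_of`, which maps each letter $y$ to its copy $\overline{y}$, are assumptions of this illustration.

```python
def is_dyck(word, closing_of):
    """Return True iff word belongs to the Dyck language D_Y.

    closing_of maps every letter y of Y to its copy (the "closing"
    letter); letters outside Y and its copy are rejected.
    """
    closers = set(closing_of.values())
    stack = []
    for c in word:
        if c in closing_of:
            stack.append(closing_of[c])      # expect the matching copy later
        elif c in closers:
            if not stack or stack.pop() != c:
                return False                 # unmatched or wrongly nested
        else:
            return False                     # letter not in the alphabet
    return not stack                         # everything must be closed


brackets = {"(": ")", "[": "]"}
print(is_dyck("([])()", brackets))
print(is_dyck("([)]", brackets))
```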

A polynomial $s \in K\langle\langle\Sigma^*\rangle\rangle$ is called a monome if $|supp(s)| \le 1$. We
denote by $K[\Sigma \cup \{\varepsilon\}]$ the set of all monomes whose support is a subset of
$\Sigma \cup \{\varepsilon\}$ [11].

Let $I$ be an arbitrary index set and $(s_i)_{i \in I}$ a family of series in $K\langle\langle\Sigma^*\rangle\rangle$.
For every $w \in \Sigma^*$ we let $I_w = \{i \in I \mid s_i(w) \ne 0\}$. Then, the family $(s_i)_{i \in I}$
is called locally finite if the set $I_w$ is finite for every $w \in \Sigma^*$ (cf. [6]).
Let $h : \Delta \to K[\Sigma \cup \{\varepsilon\}]$ be a mapping. The alphabetic morphism
induced by $h$ is the mapping $h : \Delta^* \to K\langle\langle\Sigma^*\rangle\rangle$ such that for every
$n \ge 1$, $\delta_0, \ldots, \delta_{n-1} \in \Delta$ with $h(\delta_i) = k_i.a_i$, $k_i \in K$, $a_i \in \Sigma \cup \{\varepsilon\}$, we
have $h(\delta_0 \ldots \delta_{n-1}) = k_0 \ldots k_{n-1}.a_0 \ldots a_{n-1}$ and $h(\varepsilon) = 1.\varepsilon$. We should note
that $h(v)$ is a monome for every $v \in \Delta^*$. If $\Delta$ is linearly ordered and $L \subseteq \Delta^*$
such that the family $(h(v))_{v \in L}$ is locally finite, then we let $h(L) = \sum_{v \in L} h(v)$
where we sum up in an ascending order according to the lexicographic order
on $\Delta^*$ induced by the linear order of $\Delta$. Now we are ready to prove our
Chomsky-Schützenberger theorem.
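
The value $h(v)$ of an alphabetic morphism on a single word can be computed letter by letter. The following Python sketch is a minimal illustration: the function name `apply_morphism`, the encoding of a monome $k.a$ as a pair `(k, a)` and the use of strings as stand-ins for weights are all assumptions made here, chosen so that the order of multiplication stays visible.

```python
from functools import reduce


def apply_morphism(v, h, mul, one):
    """Compute the monome h(v) as a pair (coefficient, word).

    h:   dict mapping each letter to a pair (k, a), where k is a weight
         and a is a letter of Σ or the empty string for ε.
    mul: the multiplication of the bimonoid, one: its unit.
    """
    # k_0 · k_1 · ... · k_{n-1}, multiplied from left to right
    coeff = reduce(mul, (h[d][0] for d in v), one)
    # a_0 a_1 ... a_{n-1}; ε contributes nothing
    word = "".join(h[d][1] for d in v)
    return coeff, word


# Stand-in weights: strings under concatenation (unit ""), used only
# to make the left-to-right order of the product visible.
h = {"x": ("p", "a"), "y": ("q", ""), "z": ("r", "b")}
print(apply_morphism("xyz", h, mul=lambda k1, k2: k1 + k2, one=""))
print(apply_morphism("", h, mul=lambda k1, k2: k1 + k2, one=""))
```

For the empty word the function returns the pair `(one, "")`, matching $h(\varepsilon) = 1.\varepsilon$.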


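Theorem 8 (Chomsky-Schützenberger) Let $s \in K\langle\langle\Sigma^*\rangle\rangle$ be a context-free
series. Then, there is a linearly ordered alphabet $Y \cup \overline{Y}$, a recognizable
language $L$ over $Y \cup \overline{Y}$, and an alphabetic morphism $h : Y \cup \overline{Y} \to K[\Sigma \cup \{\varepsilon\}]$
such that $s = h(D_Y \cap L)$.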


Proof: Let $G = (\Sigma, N, S, R, wt)$ be a wcfg over $\Sigma$ and $K$ such that
$s = \|G\|$. By Proposition 7 we can assume that $G$ is in Chomsky normal
form. Following the proof of Theorem G.1, page 199 in [17], we define for
every $r \in R$ the new letters $y_{1,r}, \overline{y}_{1,r}, y_{2,r}, \overline{y}_{2,r}$ and a new rule $r'$ which is
determined as follows:


$r' = (A \to y_{1,r}\, B\, \overline{y}_{1,r}\, y_{2,r}\, C\, \overline{y}_{2,r})$ if $r = (A \to BC)$ with $B, C \in N$
lL AS yr ifr =(A >a) withae DU fe}. 


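We let $Y = \{y_{1,r}, y_{2,r} \mid r \in R\}$ and $\overline{Y} = \{\overline{y}_{1,r}, \overline{y}_{2,r} \mid r \in R\}$ and consider the
context-free grammar $G' = (Y \cup \overline{Y}, N, S, R')$ with $R' = \{r' \mid r \in R\}$. We
define a linear order on $R'$ by letting $r'_1 \le r'_2$ whenever $r_1 \le r_2$ for every
$r'_1, r'_2 \in R'$. Furthermore, we define a linear order on $Y \cup \overline{Y}$ by setting

\[
y_{1,r_1} < \overline{y}_{1,r_1} < y_{2,r_1} < \overline{y}_{2,r_1} < y_{1,r_2} < \overline{y}_{1,r_2} < y_{2,r_2} < \overline{y}_{2,r_2}
\]

whenever $r_1 < r_2$, for every $r_1, r_2 \in R$. Obviously, $L(G') \subseteq D_Y$. Moreover, by construction $G'$
is unambiguous, and by the aforementioned proof in [17], we get that there
exists a recognizable language $L$ over $Y \cup \overline{Y}$ such that $L(G') = D_Y \cap L$.

Next we consider the alphabetic morphism $h$ induced by the mapping
$h : Y \cup \overline{Y} \to K[\Sigma \cup \{\varepsilon\}]$, where

\[
h(y) = \begin{cases}
1.\varepsilon & \text{if } y \in \{\overline{y}_{i,r}, y_{2,r} \mid r \in R,\ i = 1, 2\} \\
wt(r).\varepsilon & \text{if } y = y_{1,r} \text{ and } r = (A \to BC) \text{ with } B, C \in N \\
wt(r).a & \text{if } y = y_{1,r} \text{ and } r = (A \to a) \text{ with } a \in \Sigma \cup \{\varepsilon\}
\end{cases}
\]

for every $y \in Y \cup \overline{Y}$.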

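Let now $w \in \Sigma^*$ and assume that $\mathit{lf}D(w) = \{d_1, \ldots, d_n\}$ is the set of
all loop-free derivations of $G$ for $w$ with $d_1 \le_{lex} \ldots \le_{lex} d_n$. By construction
of $G'$, there is a derivation $d'_i$ of $G'$ which corresponds to $d_i$ for every $1 \le i \le n$,
and vice versa. Moreover, taking into account the linear orders of $R$ and $R'$,
we trivially get $d'_1 \le_{lex} \ldots \le_{lex} d'_n$. Let us assume now that $S \Rightarrow^{d'_i}_{G'} u_i$
with $u_i \in (Y \cup \overline{Y})^*$ for every $1 \le i \le n$. We prove that $u_1 \le_{lex} \ldots \le_{lex} u_n$.
For this, we fix an index $1 \le i \le n-1$ and show that $u_i \le_{lex} u_{i+1}$. Let
$d'_i = r'_1 \ldots r'_m r' v'$ and $d'_{i+1} = r'_1 \ldots r'_m p' z'$ where $r'_1, \ldots, r'_m \in R'$, $r' = (A \to
y_{1,r} t) \in R'$, $p' = (A \to y_{1,p} t') \in R'$ with $r' < p'$, and $v', z' \in (R')^*$. Then we
get

\[
S \Rightarrow^{r'_1 \ldots r'_m}_{G'} u A \xi \Rightarrow^{r'}_{G'} u\, y_{1,r}\, t\, \xi \Rightarrow^{v'}_{G'} u_i
\]

for $d'_i$, and

\[
S \Rightarrow^{r'_1 \ldots r'_m}_{G'} u A \xi \Rightarrow^{p'}_{G'} u\, y_{1,p}\, t'\, \xi \Rightarrow^{z'}_{G'} u_{i+1}
\]

for $d'_{i+1}$, with $u \in (Y \cup \overline{Y})^*$.
Since $r' < p'$ we get $r < p$, and thus $y_{1,r} < y_{1,p}$ which implies that
$u_i \le_{lex} u_{i+1}$.
Then, for every $1 \le i \le n$, we have $h(u_i) = weight(d_i).w$. Moreover,
by construction of $G'$, it holds $w \notin supp(h(u))$ for every $u \in (D_Y \cap L) \setminus
\{u_1, \ldots, u_n\}$. Therefore, we get

\[
\begin{aligned}
\|G\|(w) &= weight(d_1) + \ldots + weight(d_n) \\
&= h(u_1)(w) + \ldots + h(u_n)(w) \\
&= \sum_{u \in D_Y \cap L} h(u)(w) \\
&= \Bigl(\sum_{u \in D_Y \cap L} h(u)\Bigr)(w) \\
&= h(D_Y \cap L)(w).
\end{aligned}
\]

If $\mathit{lf}D(w) = \emptyset$, then $\{u \in D_Y \cap L \mid w \in supp(h(u))\} = \emptyset$. We conclude that
$s = h(D_Y \cap L)$, and our proof is completed.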


Corollary 9 Let $s \in K\langle\langle\Sigma^*\rangle\rangle$ be a context-free series. Then, there is a
linearly ordered alphabet $\Delta$, an unambiguous context-free grammar $G$ over $\Delta$,
and an alphabetic morphism $h : \Delta \to K[\Sigma \cup \{\varepsilon\}]$ such that $s = h(L(G))$.
Proof: We obtain our result by Theorem 8 by letting $\Delta = Y \cup \overline{Y}$ and
$G = G'$.


In [11] the authors proved a Chomsky-Schützenberger theorem for wcfg
over unital valuation monoids. Since strong bimonoids are particular unital
valuation monoids, their Chomsky-Schützenberger theorem holds for wcfg
over strong bimonoids. On the other hand, strong bimonoids are particular
bimonoids, hence the Chomsky-Schützenberger theorem for wcfg over strong
bimonoids is implied also by our corresponding result. Nevertheless, we
require that the alphabet of the involved Dyck language as well as the rules
of the wcfg be linearly ordered sets. Therefore, our theory results in a weaker
Chomsky-Schützenberger theorem for wcfg over strong bimonoids than that
of [11].


5 Weighted Right-Linear Grammars 


In this section we show that the well-known expressive equivalence of finite
automata and right-linear context-free grammars holds also in the setting
of bimonoids. More precisely, we consider weighted right-linear grammars
over $\Sigma$ and $K$ and show that the class of their series coincides with the
class of recognizable series over $\Sigma$ and $K$. Such recognizable series were
investigated in [8, 9] where they were called MK-fuzzy recognizable since the
weight structure was the one used for the fuzzification of McCarthy-Kleene
logic (cf. Example 1).


Definition 10 A weighted right-linear grammar (wrlg for short) is a wcfg
$G = (\Sigma, N, S, R, wt)$ over $\Sigma$ and $K$ whose rules are of the form $A \to aB$ or
$A \to a$ or $A \to \varepsilon$ with $a \in \Sigma$ and $B \in N$.


By definition of the set of rules of a wrlg $G$, we get that every derivation
of $G$ is loop-free.

We recall from [8, 9] the concept of weighted automata over bimonoids.

A weighted automaton over $\Sigma$ and $K$ is a seven-tuple
$\mathcal{A} = (Q, I, T, F, in, wt, fin)$, where

- $Q$ is the finite state set which is assumed to be linearly ordered,

- $I \subseteq Q$ is the set of initial states,

- $T \subseteq Q \times \Sigma \times Q$ is the set of transitions,

- $F \subseteq Q$ is the set of final states,

- $in : I \to K$ is a mapping assigning weights to the initial states,

- $wt : T \to K$ is a mapping assigning weights to the transitions, and

- $fin : F \to K$ is a mapping assigning weights to the final states of the
automaton.


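Let $w = a_0 \ldots a_{n-1}$ be a word over $\Sigma$ with $a_0, \ldots, a_{n-1} \in \Sigma$. A path $P_w$ of $\mathcal{A}$
over $w$ is a sequence of transitions $P_w := ((q_i, a_i, q_{i+1}))_{0 \le i \le n-1}$, $(q_i, a_i, q_{i+1}) \in T$
for every $0 \le i \le n-1$, with $q_0 \in I$ and $q_n \in F$. The weight of $P_w$ is
defined by

\[
weight(P_w) = in(q_0) \cdot \prod_{0 \le i \le n-1} wt(q_i, a_i, q_{i+1}) \cdot fin(q_n)
\]

where in the factor $\prod_{0 \le i \le n-1} wt(q_i, a_i, q_{i+1})$ we multiply in an ascending
order according to the usual ordering of natural numbers.
The set of paths of $\mathcal{A}$ over $w$ can be linearly ordered in the following way.
For two paths $P_w = ((q_i, a_i, q_{i+1}))_{0 \le i \le n-1}$ and $P'_w = ((q'_i, a_i, q'_{i+1}))_{0 \le i \le n-1}$
we let

\[
P_w \le P'_w \quad\text{iff}\quad q_0 q_1 \ldots q_n \le_{lex} q'_0 q'_1 \ldots q'_n.
\]

The behavior of $\mathcal{A}$ is the series $\|\mathcal{A}\| : \Sigma^* \to K$ and it is defined as follows.
Let $w \in \Sigma^+$ and $\{P_{w,1}, \ldots, P_{w,m}\}$ be the set of all paths of $\mathcal{A}$ over $w$.
Furthermore, assume that $P_{w,1} \le \ldots \le P_{w,m}$. Then, we set

\[
\|\mathcal{A}\|(w) = weight(P_{w,1}) + \ldots + weight(P_{w,m}).
\]

If there are no paths of $\mathcal{A}$ over $w$, then we let $\|\mathcal{A}\|(w) = 0$. If $w = \varepsilon$, then

\[
\|\mathcal{A}\|(\varepsilon) = (in(q_{i_1}) \cdot fin(q_{i_1})) + \ldots + (in(q_{i_m}) \cdot fin(q_{i_m}))
\]

where $I \cap F = \{q_{i_1}, \ldots, q_{i_m}\}$ and $q_{i_1} \le \ldots \le q_{i_m}$. If $I \cap F = \emptyset$, then we set
$\|\mathcal{A}\|(\varepsilon) = 0$. A series $s : \Sigma^* \to K$ is called recognizable if there is a weighted
automaton $\mathcal{A}$ over $\Sigma$ and $K$ such that $s = \|\mathcal{A}\|$.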
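
The behavior of a weighted automaton over a bimonoid can be computed naively by enumerating paths, ordering them, and summing their weights in that order. The Python sketch below is an illustration under stated assumptions: the function name `behavior`, the dictionary-based encoding of the automaton, and the reading of the path order as the lexicographic order on state sequences are choices made here, not details confirmed by the paper. The enumeration is exponential in general and meant only for small experiments.

```python
def behavior(w, states, init, trans, final, add, mul, zero):
    """Compute the behavior of a weighted automaton on the word w.

    states: list of the states in ascending order
    init:   dict state -> weight, defined on the initial states I
    trans:  dict (q, a, q') -> weight, defined on the transitions T
    final:  dict state -> weight, defined on the final states F
    add, mul, zero: the operations + and · and the element 0 of K
    """
    rank = {q: i for i, q in enumerate(states)}
    if len(w) == 0:
        # sum of in(q)·fin(q) over the states of I ∩ F, in ascending order
        total = zero
        for q in sorted((q for q in init if q in final), key=rank.get):
            total = add(total, mul(init[q], final[q]))
        return total
    # each partial path is stored as (state sequence, weight so far)
    paths = [((q,), init[q]) for q in init]
    for a in w:
        paths = [(seq + (q2,), mul(wgt, k))
                 for seq, wgt in paths
                 for (q1, b, q2), k in trans.items()
                 if q1 == seq[-1] and b == a]
    # keep the paths ending in a final state and multiply by fin
    paths = [(seq, mul(wgt, final[seq[-1]]))
             for seq, wgt in paths if seq[-1] in final]
    # order the paths lexicographically by their state sequences
    paths.sort(key=lambda p: [rank[q] for q in p[0]])
    total = zero
    for _, wgt in paths:
        total = add(total, wgt)          # sum in ascending path order
    return total


# Usage over the semiring of natural numbers, which is a bimonoid.
states = ["p", "q"]
init = {"p": 2}
final = {"q": 3}
trans = {("p", "a", "p"): 2, ("p", "a", "q"): 5, ("q", "a", "q"): 7}
plus = lambda x, y: x + y
times = lambda x, y: x * y
print(behavior("aa", states, init, trans, final, plus, times, 0))
```

Over a commutative addition, as in this usage example, the sorting step has no effect on the result; it matters only when `add` is not commutative, which is the situation the linear orders in the definition are designed for.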


Theorem 11 Let $\Sigma$ be a linearly ordered alphabet and $K$ a bimonoid. Then
a series $s \in K\langle\langle\Sigma^*\rangle\rangle$ is generated by a wrlg iff it is recognized by a weighted
automaton over $\Sigma$ and $K$.
Proof: Let us assume first that $s$ is recognized by a weighted automaton
$\mathcal{A} = (Q, I, T, F, in, wt_{\mathcal{A}}, fin)$. We construct the wrlg $G = (\Sigma, N, S, R, wt_G)$
over $\Sigma$ and $K$ in the following way. We let $N = Q \cup (I \times Q) \cup \{S\}$ where $S$
is a new symbol, and $R = R_1 \cup R_2 \cup R_3 \cup R_4 \cup R_5 \cup R_6$ with

- $R_1 = \{S \to a(q_0, q) \mid q_0 \in I, (q_0, a, q) \in T\}$,
- $R_2 = \{(q_0, q) \to aq' \mid q_0 \in I, (q, a, q') \in T\}$,
- $R_3 = \{q \to aq' \mid (q, a, q') \in T\}$,
- $R_4 = \{q \to \varepsilon \mid q \in F\}$,
- $R_5 = \{(q_0, q) \to \varepsilon \mid q_0 \in I, q \in F\}$, and
- $R_6 = \{S \to \varepsilon\}$ if $I \cap F \ne \emptyset$, and $R_6 = \emptyset$ otherwise.


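We define a linear order on $\Sigma \cup Q$ by taking the orders of $\Sigma$ and $Q$, and
letting $\max \Sigma < \min Q$. This implies a linear order on the set $R$ of rules.
More precisely, we let:

- $S \to a(q_0, q) \le S \to a'(q'_0, q')$ iff $a q_0 q \le_{lex} a' q'_0 q'$,
  for every $S \to a(q_0, q) \in R_1$, $S \to a'(q'_0, q') \in R_1$,
- $(q_0, q) \to ap \le (q'_0, q') \to a'p'$ iff $a q_0 q p \le_{lex} a' q'_0 q' p'$,
  for every $(q_0, q) \to ap \in R_2$, $(q'_0, q') \to a'p' \in R_2$,
- $q_1 \to a q_2 \le q'_1 \to a' q'_2$ iff $a q_1 q_2 \le_{lex} a' q'_1 q'_2$,
  for every $q_1 \to a q_2 \in R_3$, $q'_1 \to a' q'_2 \in R_3$,
- $q \to \varepsilon \le q' \to \varepsilon$ iff $q \le q'$,
  for every $q \to \varepsilon \in R_4$, $q' \to \varepsilon \in R_4$,
- $(q_0, q) \to \varepsilon \le (q'_0, q') \to \varepsilon$ iff $q_0 q \le_{lex} q'_0 q'$,
  for every $(q_0, q) \to \varepsilon \in R_5$, $(q'_0, q') \to \varepsilon \in R_5$,

and

\[
\max R_1 < \min R_2, \quad \max R_2 < \min R_3, \quad \max R_3 < \min R_4, \quad \max R_4 < \min R_5.
\]

Moreover, we set $S \to \varepsilon < \min R_1$ if $R_6 \ne \emptyset$.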
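The weight mapping $wt_G$ is defined for every rule $r \in R$ by

\[
wt_G(r) = \begin{cases}
in(q_0) \cdot wt_{\mathcal{A}}(q_0, a, q) & \text{if } r = (S \to a(q_0, q)) \in R_1 \\
wt_{\mathcal{A}}(q, a, q') & \text{if } r = ((q_0, q) \to aq') \in R_2 \text{ or } r = (q \to aq') \in R_3 \\
fin(q) & \text{if } r = (q \to \varepsilon) \in R_4 \text{ or } r = ((q_0, q) \to \varepsilon) \in R_5 \\
\|\mathcal{A}\|(\varepsilon) & \text{if } r = (S \to \varepsilon).
\end{cases}
\]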
Let $w = a_0 \ldots a_{n-1} \in \Sigma^+$ with $a_0, \ldots, a_{n-1} \in \Sigma$ and
$P_w = ((q_i, a_i, q_{i+1}))_{0 \le i \le n-1}$ be a path of $\mathcal{A}$ over $w$. By construction of
the wrlg $G$ there is a unique derivation $d = r_0 \ldots r_n$ of $G$ for $w$, which
corresponds to $P_w$, such that

\[
S \Rightarrow^{r_0} a_0 (q_0, q_1) \Rightarrow^{r_1} a_0 a_1 q_2 \Rightarrow^{r_2} \ldots \Rightarrow^{r_{n-1}} a_0 a_1 \ldots a_{n-1} q_n \Rightarrow^{r_n} a_0 a_1 \ldots a_{n-1} = w.
\]

Conversely, for every derivation $d$ of $G$ for $w$, there is a unique path
$P_w$ of $\mathcal{A}$ over $w$ which corresponds to $d$. Furthermore, by a straightforward
calculation we get

\[
weight(P_w) = weight(d).
\]

Hence, there is a one-to-one correspondence among the paths $P_{w,1}, \ldots, P_{w,m}$
of $\mathcal{A}$ over $w$ and the derivations $d_1, \ldots, d_m$ of $G$ for $w$. Next, we show that

\[
d_1 \le_{lex} \ldots \le_{lex} d_m \quad\text{iff}\quad P_{w,1} \le \ldots \le P_{w,m}.
\]

Indeed, let us assume that $d_j \le_{lex} d_{j+1}$ for some $1 \le j \le m-1$. This implies
that there is an index $0 \le k \le n-1$ such that $d_j = r_0 \ldots r_{k-1} r_k \ldots r_n$
and $d_{j+1} = r_0 \ldots r_{k-1} r'_k \ldots r'_n$ with $r_k < r'_k$. Let $r_0 = (S \to a_0 (q_0, q_1))$,
$r_1 = ((q_0, q_1) \to a_1 q_2)$, $r_l = (q_l \to a_l q_{l+1})$ for every $2 \le l \le n-1$, and
$r_n = (q_n \to \varepsilon)$, hence $P_{w,j} = ((q_i, a_i, q_{i+1}))_{0 \le i \le n-1}$. We distinguish the
following cases.

• $k = 0$. Then $r'_0 = (S \to a_0 (q'_0, q'_1))$, $r'_1 = ((q'_0, q'_1) \to a_1 q'_2)$,
$r'_l = (q'_l \to a_l q'_{l+1})$ for every $2 \le l \le n-1$, and $r'_n = (q'_n \to \varepsilon)$.
By definition of the order on $R$, we get $q_0 < q'_0$, or $q_0 = q'_0$ and
$q_1 < q'_1$. This implies that $P_{w,j+1} = ((q'_i, a_i, q'_{i+1}))_{0 \le i \le n-1}$ or $P_{w,j+1} =
(q_0, a_0, q'_1)((q'_i, a_i, q'_{i+1}))_{1 \le i \le n-1}$.

• $k = 1$. Then, by our assumption we get $r'_1 = ((q_0, q_1) \to a_1 q'_2)$ with
$q_2 < q'_2$, $r'_l = (q'_l \to a_l q'_{l+1})$ for every $2 \le l \le n-1$, and $r'_n = (q'_n \to \varepsilon)$.
Hence, $P_{w,j+1} = (q_0, a_0, q_1)(q_1, a_1, q'_2)((q'_i, a_i, q'_{i+1}))_{2 \le i \le n-1}$.

• $1 < k \le n-1$. Again by our assumption we get $r'_k = (q_k \to a_k q'_{k+1})$
with $q_{k+1} < q'_{k+1}$, $r'_l = (q'_l \to a_l q'_{l+1})$ for every $k+1 \le l \le n-1$,
and $r'_n = (q'_n \to \varepsilon)$. Thus we get $P_{w,j+1} = ((q_i, a_i, q_{i+1}))_{0 \le i \le k-1}
(q_k, a_k, q'_{k+1})((q'_i, a_i, q'_{i+1}))_{k+1 \le i \le n-1}$.

We conclude that $P_{w,j} \le_{lex} P_{w,j+1}$ in any case. The converse implication is
shown with a similar argument.


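Trivially, if there is no path of $\mathcal{A}$ over $w$, then there is no derivation
of $G$ for $w$ and vice versa. Therefore $\|G\|(w) = \|\mathcal{A}\|(w)$.

Next let $w = \varepsilon$. If $I \cap F \ne \emptyset$, then $R_6 \ne \emptyset$ and $\|G\|(\varepsilon) = \|\mathcal{A}\|(\varepsilon)$,
whereas if $I \cap F = \emptyset$, then $R_6 = \emptyset$ and both $\|G\|(\varepsilon)$ and $\|\mathcal{A}\|(\varepsilon)$ are equal to 0.
We conclude that $\|G\| = \|\mathcal{A}\|$.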


Assume now that $s$ is generated by a wrlg $\mathcal{G} = (\Sigma, N, S, R, wt_{\mathcal{G}})$ over $\Sigma$
and $K$. We consider a new symbol $E \notin N$ and construct the weighted
automaton $\mathcal{A} = (Q, I, T, F, in, wt_{\mathcal{A}}, fin)$ over $\Sigma$ and $K$ with


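$$
\begin{aligned}
Q &= \{(A,r) \mid A \in N \text{ and } r = (A \to u) \in R\} \cup \{E\}, \\
I &= \{(S,r) \mid r = (S \to u) \in R\}, \\
T &= \{((A,r), a, (B,p)) \mid r = (A \to aB) \in R,\ p = (B \to u) \in R\} \\
  &\quad \cup \{((A,r), a, E) \mid r = (A \to a) \in R\}, \\
F &= \{(A,r) \mid r = (A \to \varepsilon) \in R\} \cup \{E\},
\end{aligned}
$$

$in(q) = 1$ for every $q \in I$,
$wt_{\mathcal{A}}((A,r), a, q) = wt_{\mathcal{G}}(r)$ for every $((A,r), a, q) \in T$, and

$$
fin(q) = \begin{cases} wt_{\mathcal{G}}(r) & \text{if } q = (A,r) \\ 1 & \text{if } q = E \end{cases}
\quad \text{for every } q \in F.
$$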


We define a linear order on $Q$ by letting

$$(A,r) \leq (A',r') \quad\text{iff}\quad r \leq r'$$

for every $(A,r), (A',r') \in Q \setminus \{E\}$, and $\max Q = E$.
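
The construction of $\mathcal{A}$ from a wrlg can be sketched in code. The following is a minimal illustration under stated assumptions, not part of the original proof: the `Rule` encoding, the function name `build_automaton`, and the toy grammar with string weights are all hypothetical, and the linear order on $Q$ is not modelled.

```python
from typing import NamedTuple, Optional

class Rule(NamedTuple):
    """Right-linear rule: lhs -> letter rhs, lhs -> letter, or lhs -> epsilon."""
    lhs: str
    letter: Optional[str]  # None encodes the empty word epsilon
    rhs: Optional[str]     # None: no nonterminal on the right-hand side

E = "E"  # the fresh state E; states (A, r) are tuples, so no clash is possible

def build_automaton(rules, wt_G, start, one):
    """Return (Q, I, T, F, init, wt_A, fin) following the construction in the text."""
    Q = [(r.lhs, r) for r in rules] + [E]
    I = [(r.lhs, r) for r in rules if r.lhs == start]
    T = []
    for r in rules:
        if r.letter is None:
            continue  # r = (A -> epsilon) yields a final state, not a transition
        if r.rhs is None:
            T.append(((r.lhs, r), r.letter, E))  # r = (A -> a)
        else:
            # r = (A -> aB): one transition for every rule p with left-hand side B
            T.extend(((r.lhs, r), r.letter, (p.lhs, p)) for p in rules if p.lhs == r.rhs)
    F = [(r.lhs, r) for r in rules if r.letter is None] + [E]
    init = {q: one for q in I}
    wt_A = {t: wt_G[t[0][1]] for t in T}  # a transition inherits the weight of its source rule
    fin = {q: (one if q == E else wt_G[q[1]]) for q in F}
    return Q, I, T, F, init, wt_A, fin

# Toy grammar (hypothetical): S -> aS, S -> a, S -> epsilon, with string weights.
r0, r1, r2 = Rule("S", "a", "S"), Rule("S", "a", None), Rule("S", None, None)
wt_G = {r0: "x", r1: "y", r2: "z"}
Q, I, T, F, init, wt_A, fin = build_automaton([r0, r1, r2], wt_G, "S", one="")
```

In this toy instance every rule has left-hand side $S$, so all three states of the form $(S, r)$ are initial, and the rule $S \to aS$ contributes one transition per rule with left-hand side $S$.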

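Let now $w = a_0 \ldots a_{n-1} \in \Sigma^*$ with $a_0, \ldots, a_{n-1} \in \Sigma$ and $d$ a derivation
of $\mathcal{G}$ for $w$. Since $\mathcal{G}$ is a wrlg, there are rules $r_i = (A_i \to a_i A_{i+1}) \in R$,
$0 \leq i \leq n-2$, with $A_0 = S$ such that

$$S \overset{r_0}{\Longrightarrow} a_0 A_1 \overset{r_1}{\Longrightarrow} a_0 a_1 A_2 \overset{r_2}{\Longrightarrow} \cdots \overset{r_{n-2}}{\Longrightarrow} a_0 a_1 \ldots a_{n-2} A_{n-1}.$$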
We distinguish two cases: 


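i) there is a rule $r_{n-1} = (A_{n-1} \to a_{n-1}) \in R$,
hence we get $a_0 a_1 \ldots a_{n-2} A_{n-1} \overset{r_{n-1}}{\Longrightarrow} a_0 \ldots a_{n-1}$, or
ii) there are rules $r_{n-1} = (A_{n-1} \to a_{n-1} A_n) \in R$, $r_n = (A_n \to \varepsilon) \in R$,
hence we get $a_0 a_1 \ldots a_{n-2} A_{n-1} \overset{r_{n-1}}{\Longrightarrow} a_0 \ldots a_{n-1} A_n \overset{r_n}{\Longrightarrow} a_0 \ldots a_{n-1}$.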


By construction of the weighted automaton $\mathcal{A}$, we get respectively the paths

i') $((S, r_0), a_0, (A_1, r_1)) ((A_1, r_1), a_1, (A_2, r_2)) \ldots ((A_{n-1}, r_{n-1}), a_{n-1}, E)$, or

ii') $((S, r_0), a_0, (A_1, r_1)) ((A_1, r_1), a_1, (A_2, r_2)) \ldots ((A_{n-1}, r_{n-1}), a_{n-1}, (A_n, r_n))$.

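Furthermore, by a straightforward calculation, we get respectively

i'')
$$
\begin{aligned}
\mathrm{weight}(d) &= wt_{\mathcal{G}}(r_0) \cdot \ldots \cdot wt_{\mathcal{G}}(r_{n-1}) \\
&= in(S, r_0) \cdot wt_{\mathcal{A}}((S, r_0), a_0, (A_1, r_1)) \cdot \ldots \cdot wt_{\mathcal{A}}((A_{n-1}, r_{n-1}), a_{n-1}, E) \cdot fin(E) \\
&= \mathrm{weight}(P_w), \text{ or}
\end{aligned}
$$

ii'')
$$
\begin{aligned}
\mathrm{weight}(d) &= wt_{\mathcal{G}}(r_0) \cdot \ldots \cdot wt_{\mathcal{G}}(r_{n-1}) \cdot wt_{\mathcal{G}}(r_n) \\
&= in(S, r_0) \cdot wt_{\mathcal{A}}((S, r_0), a_0, (A_1, r_1)) \cdot \ldots \cdot wt_{\mathcal{A}}((A_{n-1}, r_{n-1}), a_{n-1}, (A_n, r_n)) \cdot fin(A_n, r_n) \\
&= \mathrm{weight}(P_w).
\end{aligned}
$$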
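
The two weight calculations can be checked on a small instance. The sketch below is only an illustration under assumptions: the multiplicative monoid is taken to be strings under concatenation (non-commutative, with unit the empty string), and the rule weights `"p"`, `"q"`, `"r"`, `"s"` are hypothetical. It shows that the factors of the derivation weight and of the path weight appear in the same left-to-right order, so no commutativity is needed.

```python
from functools import reduce

def product(factors, times, one):
    """Multiply the factors from left to right; the order matters if `times` is non-commutative."""
    return reduce(times, factors, one)

# Illustrative multiplicative monoid (an assumption): strings under concatenation, unit "".
def times(x, y):
    return x + y

one = ""

# Hypothetical weights wt_G(r_0), ..., wt_G(r_n) for a derivation of case ii) with n = 3.
rule_weights = ["p", "q", "r", "s"]
weight_d = product(rule_weights, times, one)

# Corresponding path: in(S, r_0) = 1, the i-th transition carries wt_G(r_i),
# and the final weight is fin(A_n, r_n) = wt_G(r_n).
weight_path = product([one, *rule_weights[:-1], rule_weights[-1]], times, one)

# Case i): the derivation uses r_0, ..., r_{n-1} only, and the path ends in E with fin(E) = 1.
weight_d_i = product(rule_weights[:-1], times, one)
weight_path_i = product([one, *rule_weights[:-1], one], times, one)
```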


By similar arguments we can show that for every path $P_w$ of $\mathcal{A}$ over $w$ there
is a unique derivation $d$ in $\mathcal{G}$ for $w$ with the same weight. Furthermore,
if $d_1, \ldots, d_m$ are all the derivations of $\mathcal{G}$ for $w$ and $P_{w,1}, \ldots, P_{w,m}$ are the
corresponding paths of $\mathcal{A}$ over $w$, then by standard calculations we get

$$d_1 <_{\mathrm{lex}} \cdots <_{\mathrm{lex}} d_m$$

iff

$$P_{w,1} <_{\mathrm{lex}} \cdots <_{\mathrm{lex}} P_{w,m}.$$

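Hence, we derive $\|\mathcal{A}\|(w) = \|\mathcal{G}\|(w)$. If $D(w) = \emptyset$, then obviously there are no
paths of $\mathcal{A}$ over $w$, thus both $\|\mathcal{A}\|(w)$ and $\|\mathcal{G}\|(w)$ are equal to $0$. Finally, assume
that there is a rule $r = (S \to \varepsilon) \in R$. Then, by definition of $I$ and $F$, we get
$I \cap F = \{(S,r)\}$, hence $\|\mathcal{A}\|(\varepsilon) = in(S,r) \cdot fin(S,r) = 1 \cdot wt_{\mathcal{G}}(r) = \|\mathcal{G}\|(\varepsilon)$.
We conclude that $\|\mathcal{A}\| = \|\mathcal{G}\|$, and our proof is completed.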


Conclusion 


We introduced and studied wcfg over an alphabet $\Sigma$ and an arbitrary
bimonoid $K$. Our work was motivated by the recent need for weighted
computational models over bimonoids in practical applications [20, 21].
As our main results, we showed that for every wcfg we can effectively
construct an equivalent one in Chomsky normal form, and proved a
Chomsky-Schützenberger type result for the class of series generated by our grammars.
Furthermore, we proved in our setting a well-known result relating the
notions of recognizability and context-freeness, namely that the class of series
generated by weighted right-linear grammars coincides with the class of
recognizable series over $\Sigma$ and $K$. For this, we required the input alphabet $\Sigma$
to be linearly ordered. Several problems remain open, in particular the
closure of the class of series generated by wcfg under scalar product from
the left, under Cauchy product, and under Kleene star. Notably, the last two
operations are not defined in a unique way for series over bimonoids
(cf. [8, 9, 26]) due to the lack of commutativity of the operations. It is
also an open question whether the Hadamard product of a context-free series
with a recognizable series over $\Sigma$ and $K$ is still a context-free series.
As future work, we plan to investigate weighted pushdown automata with
weights in $K$, which is an interesting problem for practical applications.